ABSTRACT

Forthcoming instruments designed for high-cadence large-area surveys, such as the Dark Energy Survey and Large Synoptic Survey Telescope, will generate several GB of data products every few minutes during survey operations. Since such surveys are designed to operate with minimal observer interaction, automated real-time analysis of these large images is necessary to ensure uninterrupted production of science-quality data. We describe a software infrastructure suite designed to support such surveys, focusing particularly on ImageHealth, a tool for near-real-time processing of large images. These image manipulation and analysis algorithms were applied to simulated data from the Dark Energy Survey, as well as observed data collected by the Y4KCam on the CTIO 1m telescope and the Mosaic camera on the Blanco telescope. The accuracy and speed of the ImageHealth code in particular were benchmarked against results from SourceExtractor, a standard image analysis tool ubiquitous in the astronomical community. ImageHealth is shown to provide comparable accuracy to SourceExtractor when examining bright objects in the focal plane, but with significantly shorter execution time. Based on the importance of real-time analysis in reaching the Dark Energy Survey's science goals, ImageHealth and other aspects of this analysis package were incorporated (in modified form) into the Survey Image System Process Integration, the Dark Energy Camera software control environment. The original ImageHealth code, however, is completely instrument-independent, and is freely available for use within other observational data-taking environments.

Subject headings: Methods: data analysis - Techniques: image processing - Surveys 



1. The Need for Real-Time Analysis of 
Large Astronomical Datasets 

Forthcoming instruments intended to be used for large-area high-cadence surveys such as the Dark Energy Survey, or DES (Annis et al. 2005), and the Large Synoptic Survey Telescope, or LSST (Abell et al. 2009), will potentially collect enormous amounts of data in an entirely automated fashion; therefore, observers need automated tools to analyze the performance of the instrument in real time. The focal planes of these instruments are very large: DES images, for example, are of order 3 deg^2 and 1 GB in size (see Fig. 1), while LSST images will be a factor of several larger. Furthermore, the cadence of these instruments is very rapid: DES images will be acquired approximately every two minutes over an entire 8- to 12-hour night, while LSST images will be acquired at an even faster cadence. Therefore, these tools must swiftly analyze a significant amount of information in a manner that facilitates immediate identification of problematic data and prompt recovery in the event of observing malfunctions. For example, if the telescope loses proper focus, all subsequent data taken by the survey instrument would be unusable for the key science analyses, especially the most sensitive ones, such as Weak Lensing. Offline analysis of images by observers has no set cadence, so without an automated image quality analysis, significant observing resources could be wasted before human intervention identifies and corrects any problems. In light of this need, a suite of software tools has been devised to fulfill the goal of rapid-response, "quick and dirty" image analysis.

2. Image Analysis for the Dark Energy 
Survey and Beyond 

The "central nervous system" of the Dark Energy Camera (DECam), used for controlling all aspects of the instrument and providing images, telemetry, and other feedback to observers, is known as the Survey Image System Process Integration, or SISPI (Honscheid et al. 2008). Figure 2 shows a schematic representation of the SISPI components and the flow of data. Several independent image manipulation and analysis modules have been incorporated into the Observer Control and Instrument Control aspects of SISPI in order to provide near-real-time information to observers. These tools include a Real Time Display, which provides a static, compressed (from 1 GB to 4 MB) focal plane image within 1 s; Instrument Health, which accumulates telemetry information and prepares time-history plots of data important to maintaining the integrity of the observing process; Quick Reduce, which performs more sophisticated image processing (e.g., astrometry and photometry) on a selection of images within the dataflow, where depth of analysis is more important than immediate feedback; and ImageHealth, which performs a very fast check of CCDs similar to the widely-used Source Extractor (Bertin & Arnouts 1996), but more streamlined in execution and output. In addition to all of the tools present within the camera infrastructure, separate "off-line" software has been developed for the full analysis of all data obtained for the Dark Energy Survey.

The various software subsystems within SISPI will be described in forthcoming publications of the DES Collaboration; in the rest of this work, we focus on the specific software tool designed for the broadest applicability outside of the DES: the ImageHealth algorithm. Section 3 describes the steps of the algorithm, while Section 4 describes the variety of user-defined input parameters that can be modified for algorithm performance optimization in a wide variety of observational settings. Section 5 provides instructions on usage and describes the format of the code output. Section 6 provides a quantitative analysis of ImageHealth results, and Section 7 provides the conclusions drawn from this analysis.

3. The Image Health Algorithm 

ImageHealth (IH) is designed to rapidly identify relatively bright objects within an image, and then swiftly determine a few specific image- and object-based parameters that can diagnose the overall quality of the image. While the DES-specific version has undergone modification for integration into the Survey's software infrastructure, in its most basic, instrument-independent form ImageHealth is a C program that incorporates the standard CFITSIO library (Pence et al. 1999), and executes the following steps:

• Opens the image from a file, reads relevant FITS image header keyword values (e.g., saturation, axis size), and determines the number of extensions contained in the file.

• Performs the following over all the image extensions two times, once for the right half and once for the left half of each image extension:

— Reads pixel values to a data structure.

— Finds the mean of (non-saturated, non-dead) pixel values, including either every pixel or every Nth pixel (as determined by the "pixel skip" value).

— Determines the location of a bright pixel, which becomes the first (seed) pixel of the "Object".

— Finds all pixels associated with that Object. If any associated pixels are saturated, a new Object is selected and the old Object is discarded.

— Calculates Object (and full image) statistics: Background counts near Object and Mean Sky Background counts over half of the image extension; Number of Dead, Saturated, and Good pixels; Object Flux (in counts); Object Size (FWHM, in pixels); Object Major and Minor Axis (in pixels); and Object Ellipticity and Orientation Angle of Object ellipse. This narrowly-defined set of parameters provides the basis for effectively diagnosing some of the most common scenarios from which out-of-tolerance observations will arise, including: voltage or thermal control failures in the instrumentation, sub-optimal focus settings, optical element misalignment, and poor atmospheric seeing (among others).

• Finally, IH outputs quantities for Image-by-Image and Time History Displays. In particular, anomalous quantities can be tagged to alert the user that intervention in the automated observing process is required.
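As an illustration of the background-estimation step in the list above, the following minimal C sketch computes the mean of the good pixels with optional pixel skipping. It is not taken from imagehealth.c: the function name masked_mean and its arguments are hypothetical, and the dead and saturation limits are passed in as arguments rather than taken from the #define parameters or the FITS header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of imagehealth.c.
 * Returns the mean of all sampled pixels that are neither dead
 * (value < dead) nor saturated (value > satval). Every `step`-th
 * pixel is sampled; step = 1 corresponds to full counting.
 * Returns 0.0 if no good pixel was sampled. */
double masked_mean(const float *pix, size_t npix, size_t step,
                   double dead, double satval)
{
    double sum = 0.0;
    size_t ngood = 0;

    assert(step >= 1);
    for (size_t i = 0; i < npix; i += step) {
        if (pix[i] < dead || pix[i] > satval)
            continue;               /* skip dead and saturated pixels */
        sum += pix[i];
        ngood++;
    }
    return ngood ? sum / (double)ngood : 0.0;
}
```

With step = 1 every pixel contributes; larger steps sample a regular subset of the pixels, which is the trade of accuracy for speed that the pixel skip value controls.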

Sometimes objects do not exist in a given image region, or they are too faint for an algorithm geared toward the detection of bright objects. Though in general 100% coverage is not achieved by IH (that is, a suitable Object is not always identified in every FITS extension), the number of Objects found is clearly sufficient to broadly characterize the overall performance of the instrument and the general quality of any given image. For example, DES images are recorded by 124 separate readouts (2 for each of 62 CCDs/FITS extensions); out of the 124 distinct iterations of the IH algorithm applied to these image subregions, approximately 100 Objects (on at least 50 separate CCDs covering 80% of the focal plane) are routinely discovered, with the exact value being mildly filter- and threshold-dependent.
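The text does not specify how IH derives the Object shape statistics listed in the algorithm steps (size, axes, ellipticity, orientation). One common approach, shown here purely as an assumption, is to start from the flux-weighted centroid and central second moments of the Object pixels. The types ObjPixel and Moments and the function object_moments are hypothetical names introduced for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; not taken from imagehealth.c. */
typedef struct { double x, y, flux; } ObjPixel;  /* flux: counts above local background */
typedef struct { double flux, xc, yc, mxx, myy, mxy; } Moments;

/* Computes total flux, flux-weighted centroid, and central second
 * moments of an Object. Returns 0 on success, or -1 if the total
 * flux is not positive (moments are then undefined). */
int object_moments(const ObjPixel *p, size_t n, Moments *m)
{
    double f = 0.0, sx = 0.0, sy = 0.0;
    double sxx = 0.0, syy = 0.0, sxy = 0.0;

    assert(m != NULL);
    for (size_t i = 0; i < n; i++) {
        f  += p[i].flux;
        sx += p[i].flux * p[i].x;
        sy += p[i].flux * p[i].y;
    }
    if (f <= 0.0)
        return -1;

    m->flux = f;
    m->xc = sx / f;
    m->yc = sy / f;

    for (size_t i = 0; i < n; i++) {
        double dx = p[i].x - m->xc;
        double dy = p[i].y - m->yc;
        sxx += p[i].flux * dx * dx;
        syy += p[i].flux * dy * dy;
        sxy += p[i].flux * dx * dy;
    }
    m->mxx = sxx / f;
    m->myy = syy / f;
    m->mxy = sxy / f;
    return 0;
}
```

Under this assumption, the squared major and minor axis lengths are the eigenvalues (mxx + myy)/2 ± sqrt(((mxx - myy)/2)^2 + mxy^2), the orientation angle is 0.5 atan2(2 mxy, mxx - myy), and for a Gaussian profile of width sigma the FWHM is 2 sqrt(2 ln 2) sigma, or about 2.355 sigma.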

4. User-Defined Parameters of the Image 
Health Algorithm 

Prior to compiling and executing imagehealth.c, the user can define several parameters within the code. Some of these parameters are instrument-specific and are not likely to be changed otherwise, such as:

• OVERSCAN: Size of the overscan region (in pixels) on each chip; this region is ignored by the algorithm.

• DEAD: Minimum pixel value (in ADU); pixels below this value are considered dead and are not incorporated into any calculations performed by the code.

• SATVAL: Pixel Saturation Value (in ADU); pixels above this value are not incorporated into any calculations performed by the code, and Objects with saturated pixels are ignored by the algorithm. Setting this parameter to causes the algorithm to read the saturation value from the FITS Header.

• MINPIX: Minimum Object Size (in pixels); objects containing fewer pixels above a certain limit (see below) than this number are ignored by the algorithm. Setting this parameter to 6 or more (depending on the plate scale) provides effective rejection of potential cosmic rays without eliminating a significant number of bright Objects useful to this algorithm.

• NUMPIX: Object Searchbin Size (in pixels); determines 1/2 of the side length of the rectangular search area for pixels that may be considered part of the Object. The search area is centered on the Object Seed pixel. This parameter is commonly set to 12, which will completely contain most Objects (depending on the plate scale), unless the instrument focus is egregiously bad or the Object itself is widely extended across the sky.
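To make the roles of DEAD, SATVAL, and MINPIX concrete, the sketch below classifies pixels and applies the minimum-size cut. The parameter names follow the list above, but the DEAD and SATVAL values are placeholders chosen for illustration, and the type PixelCounts and both functions are hypothetical rather than excerpts from imagehealth.c.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values for illustration only; real values are instrument-specific. */
#define DEAD    10.0f     /* ADU; pixels below this are treated as dead      */
#define SATVAL  60000.0f  /* ADU; pixels above this are treated as saturated */
#define MINPIX  6         /* minimum number of above-limit pixels per Object */

typedef struct { size_t dead, saturated, good; } PixelCounts;

/* Tallies dead, saturated, and good pixels in a pixel buffer. */
PixelCounts count_pixels(const float *pix, size_t npix)
{
    PixelCounts c = {0, 0, 0};

    assert(pix != NULL || npix == 0);
    for (size_t i = 0; i < npix; i++) {
        if (pix[i] < DEAD)
            c.dead++;
        else if (pix[i] > SATVAL)
            c.saturated++;
        else
            c.good++;
    }
    return c;
}

/* An Object candidate survives only if it has at least MINPIX pixels
 * above the Object pixel limit; smaller candidates (for example,
 * cosmic-ray hits) are rejected. */
int passes_minpix(size_t npix_above)
{
    return npix_above >= (size_t)MINPIX;
}
```

Pixels exactly equal to DEAD or SATVAL count as good here, following the "below" and "above" wording of the parameter descriptions.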

Other parameters are likely to be changed more 
regularly by users, depending on their specific 
goals and the nature of the data that they are 
analyzing. These include: 

• NMOVE: The number of FITS extensions incremented before each subsequent iteration of the code. Unless the user has a reason to skip the analysis of some portion of an image, this should be set to 1, i.e., each subsequent iteration of the code acts on (current extension + 1). If there is only a single extension to the FITS file, regardless of the value of NMOVE the algorithm is executed on that extension and then ImageHealth closes normally (as it does whenever it reaches the end of a FITS file).

• THRESHOLD: Object Seed Threshold, commonly set to 10, determines the multiplicative factor above the mean sky background that a pixel must have in order for it to serve as the seed for an Object. Lower values such as 5 or even 3 can be used if the field is populated only by fainter objects that do not result in sufficient Objects for analysis of the complete focal plane.

• OBJTHRESH: Object Pixel Threshold, the value in sigma above the image mean required for any pixel to be considered part of an Object (rather than a Background pixel). A pixel threshold of 5 results in mean photometric accuracy of ~0.5% (see Section 6) for the bright objects that this algorithm is designed to find. Fainter objects or those with broad wings are likely to be less accurately measured, as there could be several pixels that in reality should be considered part of the Object, but are below the default parameter value. In cases where many such Objects are found, a smaller value for this parameter (like 3) may be more appropriate.

• PIXSTEP: "Full Counting" or "Pixel Skipping" determines whether or not every pixel is read into the data structure. Common pixel skip values are 2, 4, or 8, which allow the algorithm to execute faster with minimal sacrifice of accuracy. The default value of 1 results in every pixel being counted.

• NLOOPS: Limit to Number of Potential Objects which may be found before the algorithm proceeds to the next iteration. If this number is exceeded, the algorithm's current iteration is terminated and it begins to examine the next subregion or FITS extension. This is useful for constraining the computing time expended on an image extension which may have many objects, but few which would be useful for the IH algorithm. Combined with the thresholds defined above, an NLOOPS of 6 results in Objects found in a high fraction of all extensions for a wide variety of test images.
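The two thresholds act at different stages: THRESHOLD selects a seed pixel, and OBJTHRESH decides which pixels in the NUMPIX search box belong to the Object. The sketch below illustrates one reading of those descriptions. The function names are hypothetical, the image is assumed to be stored row-major, the background sigma is supplied by the caller, and every above-threshold pixel in the box is counted without any connectivity check, which may differ from what imagehealth.c actually does.

```c
#include <assert.h>
#include <stddef.h>

/* Values quoted in the text as common settings. */
#define THRESHOLD 10.0   /* seed must exceed THRESHOLD x mean sky      */
#define OBJTHRESH 5.0    /* Object pixels exceed mean + OBJTHRESH x sigma */
#define NUMPIX    12     /* half side length of the search box, pixels */

/* Returns the index of the first pixel bright enough to seed an
 * Object, or -1 if none qualifies. Image is nx columns by ny rows. */
long find_seed(const float *img, long nx, long ny, double mean_sky)
{
    assert(nx > 0 && ny > 0);
    for (long i = 0; i < nx * ny; i++)
        if (img[i] > THRESHOLD * mean_sky)
            return i;
    return -1;
}

/* Counts pixels above the Object pixel threshold inside the search
 * box centered on the seed, clipped to the image boundaries. */
long count_object_pixels(const float *img, long nx, long ny, long seed,
                         double mean, double sigma)
{
    long sx = seed % nx, sy = seed / nx, n = 0;
    long x0 = (sx - NUMPIX < 0) ? 0 : sx - NUMPIX;
    long x1 = (sx + NUMPIX >= nx) ? nx - 1 : sx + NUMPIX;
    long y0 = (sy - NUMPIX < 0) ? 0 : sy - NUMPIX;
    long y1 = (sy + NUMPIX >= ny) ? ny - 1 : sy + NUMPIX;

    for (long y = y0; y <= y1; y++)
        for (long x = x0; x <= x1; x++)
            if (img[y * nx + x] > mean + OBJTHRESH * sigma)
                n++;
    return n;
}
```

Because the seed search always starts from the same pixel, lowering THRESHOLD changes which pixel is found first, which is one reason the parameter also affects where Objects are selected on the chip.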

While the constant starting pixel position of the search algorithm could potentially introduce a bias in the image location of Objects found by IH (e.g., all Objects could be clustered in one corner of each CCD), in practice the Objects are relatively evenly distributed throughout the CCDs; see Fig. 3 for a representative sample of Object positions. If users discover unacceptably nonuniform distributions of Objects, then setting THRESHOLD, NLOOPS, or PIXSTEP to higher values will likely result in different (and more uniform) Object selection. Source crowding or image seeing may also impact the choice of parameter values, especially PIXSTEP (or perhaps NUMPIX), though optimal parameter values under non-standard observing conditions are best determined by users on a case-by-case basis.

5. Using Image Health 

Once the user has set the desired values for all parameters via the #define statements within imagehealth.c, the program is compiled with (for example) gcc:

    % gcc -o ih imagehealth.c -L/CFITSIO/PATH -lcfitsio -lm

and then executed from the command line:

    % ./ih inputfile.fits outputfile.dat

If the syntax is not followed appropriately, an error message will be generated describing the proper syntax. Note that the compile step requires the CFITSIO libraries to be properly installed (/CFITSIO/PATH should also be altered to reflect the user's actual directory structure). The execute step further assumes that inputfile.fits is in the directory where the code is executed; if it is not, then the appropriate path and filename should be specified by the user.

By default during execution, a variety of status, warning, or error messages are printed to the specified output file, though if the user does not wish to see them they are trivial to comment out. Likewise, the quantities calculated for each half of each FITS extension (number of dead, saturated, and good pixels; sky background) as well as the quantities for the Object (X/Y position, local background, flux, FWHM, major and minor axis lengths, ellipticity and orientation angle) found on each half of each extension are printed to the specified output file. At the conclusion of the execution, the mean FWHM of all Objects is returned. This single number provides a useful summary of the characteristics of the entire image, though the user will likely want to examine the output file in detail to determine the true health of the image and its constituent extensions.
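A post-processing step of the kind suggested above might reduce the per-readout FWHM values to the image-wide mean and count how many Objects exceed a tolerance. The sketch below is a hypothetical illustration: the struct ImageSummary, the function summarize_fwhm, the convention that a non-positive FWHM marks a readout with no Object, and the tolerance argument are all assumptions rather than features of the IH output format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary record; not part of the IH output format. */
typedef struct {
    double mean_fwhm;  /* mean FWHM of all Objects found, -1.0 if none */
    size_t nfound;     /* number of readouts in which an Object was found */
    size_t nflagged;   /* Objects with FWHM above the tolerance */
} ImageSummary;

/* fwhm[i] <= 0 is taken to mean "no Object found in readout i"
 * (an assumed convention). max_fwhm is a user-chosen tolerance in pixels. */
ImageSummary summarize_fwhm(const double *fwhm, size_t n, double max_fwhm)
{
    ImageSummary s = { -1.0, 0, 0 };
    double sum = 0.0;

    assert(max_fwhm > 0.0);
    for (size_t i = 0; i < n; i++) {
        if (fwhm[i] <= 0.0)
            continue;              /* readout without an Object */
        sum += fwhm[i];
        s.nfound++;
        if (fwhm[i] > max_fwhm)
            s.nflagged++;
    }
    if (s.nfound > 0)
        s.mean_fwhm = sum / (double)s.nfound;
    return s;
}
```

Reporting the flagged count alongside the mean helps distinguish a uniformly degraded image (for example, poor seeing or focus) from a problem confined to a few readouts.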

In the next section we describe the comparison 
of the Image Health algorithm to that of another 
common image analysis tool, Source Extractor. 

6. Comparison of ImageHealth with Source 
Extractor 

Source Extractor (SE) is a standard image analysis and object-finding software package used for a wide variety of tasks by many in the astronomical community, and it also serves as our benchmark for comparison to the results of the ImageHealth algorithm. Analysis of hundreds of DES simulated images (Kuropatkin et al. 2012) with both ImageHealth (using the default settings intended primarily for the evaluation of bright Objects) and Source Extractor shows very good agreement in their outputs; see Table 1 for details. The supplemental code (beyond imagehealth.c) that performed these large-scale comparisons between the two packages can be provided upon request from the authors.

While IH does not have the versatility of SE, it does exhibit comparable performance for the specific parameters calculated, provided there are sufficient numbers of bright stars in each image. Additionally, the handful of input parameters and single mode of execution not only give IH greater user-friendliness than SE, but also facilitate greater modification of the algorithm and its outputs by any and all users. ImageHealth comprises only 600 lines of code, while the numerous different aspects of Source Extractor total 1700 lines. Furthermore, IH is written modularly, with distinct and clearly delineated operations for file I/O and object analysis, processes which, in Source Extractor, are intertwined in a manner not at all transparent to the user. Finally, IH offers streamlined processing with more focused output: for those applications where execution time is a significant factor, it is worth noting that on a standard (2.2 GHz) desktop machine, IH executes in 30 seconds in full counting mode, or as little as six seconds in pixel-skipping mode. On the same machine, Source Extractor takes 70 seconds to step through a full (approximately 1 GB) DES image. In the context of the Dark Energy Survey observing cadence (images acquired every two minutes or less), IH will determine the quality of image N, and allow significant time for observer intervention, well before image N+1 is read out. Thus, in the event of errors in the observation or problems with the data, at most a single image (and two minutes of observing time) is lost. The time differential (up to a factor of 10) between Source Extractor and IH execution may be even more crucial for Community Users of the DECam, who will have a wide variety of observational requirements, including, perhaps, even faster cadences. Similarly, LSST, with individual images several times larger than DES images and observations occurring at a faster cadence than DES, could benefit from implementing ImageHealth over Source Extractor for real-time image analysis solutions where only a few select output parameters are required.
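The timing argument above reduces to simple arithmetic, sketched here with two hypothetical helper functions. The inputs are the figures quoted in the text (120 s cadence; 30 s and 6 s for IH; 70 s for Source Extractor); nothing else is assumed about either program.

```c
#include <assert.h>

/* Seconds remaining for observer intervention before the next image,
 * given the observing cadence and the analysis time per image. */
double headroom_s(double cadence_s, double analysis_s)
{
    return cadence_s - analysis_s;
}

/* Ratio of a reference execution time to the analysis time. */
double speedup(double reference_s, double analysis_s)
{
    assert(analysis_s > 0.0);
    return reference_s / analysis_s;
}
```

With the quoted figures, IH leaves 90 s (full counting) or 114 s (pixel skipping) of each 120 s cycle for intervention, against 50 s for Source Extractor. The corresponding speed ratios are about 2.3 and 11.7, in line with the roughly tenfold differential stated above.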

While the ImageHealth algorithm has been tested on many (simulated) DES images, it is not instrument-specific. The algorithm was also tested on Y4KCam (Pogge 2009) images, both from science and engineering observing periods, and performed with comparable accuracy and speed to the tests on simulated DES images. Since the DES and Y4KCam datasets primarily comprised simulated or observed "good" images, additional tests were performed on images of the Coma Cluster taken with the Mosaic camera on the Blanco telescope that were known to be of poor quality. While the former tests show that "false positive" mis-identification of problems is avoided by IH, the most important result from this final test is that "false negative" mis-identification of truly problematic images is likewise avoided. Specifically, low quality, large FWHM images were readily and routinely identified as such by the ImageHealth algorithm. Subsequent to the completion of these formal tests, ImageHealth was also explored as a source of real-time feedback for the MODS instrument (Osmer et al. 2000). There are even image processing applications outside the field of astronomy (e.g., in mechanical and electrical engineering) for which the swift and accurate processing of large datasets that ImageHealth offers is proposed as the most appropriate solution (Mahboob 2009).

Parameter                      Mean IH-SE   Mean IH-SE (Pixel Skip = 8)   Units
Position                       1.8          1.72                          pixels
Object BG                      0.35         0.24                          % of counts
Object Flux                    0.53         0.68                          % of counts
FWHM                           0.22         0.24                          pixels
Ellipticity                    0.03         0.04
Major Axis Orientation Angle   8.6          7.4                           degrees

Table 1: Sample results of the ImageHealth algorithm compared with the identical parameters determined by Source Extractor (SE)

7. Conclusions

While Source Extractor is a broadly useful tool for the astronomical community, ImageHealth is shown to be the superior application specifically for observers who require real-time feedback on a well-defined set of parameters useful for determining image quality and instrument performance. This code has been modified for use within the Dark Energy Survey's Survey Image System Process Integration, the software infrastructure that will be used by the Dark Energy Survey and all Community Users of DECam. Furthermore, the instrument-independent version of the code is freely available from the Astrophysics Source Code Library (Allen et al. 2012) to all astronomers desiring to perform real-time analysis of large-scale observations using completely different instruments.

The authors wish to thank Professor Rick Pogge of The Ohio State University Department of Astronomy for useful feedback on early stages of the project, as well as access to archival Y4KCam data used in the testing of the ImageHealth algorithm. Thanks as well to Professor Klaus Honscheid for his support of these efforts, and to Professor Alex Small and Dr. Lisa Gerhardt for productive comments on early drafts of this work.

Fig. 1. — A photograph of the 570 Megapixels of the DES focal plane prior to installation on the Blanco telescope at CTIO.

Fig. 2. — A schematic of the SISPI components and the dataflow for DES data-taking.

Fig. 3. — X and Y positions of Objects found by the ImageHealth Algorithm on 2048x4096 DES CCDs.